Study of the characteristic parameters of the normal voices of
Argentinian speakers

E. V. Bonzi, G. B. Grad, A. M. Maggi, M. R. Muñoz


The voice laboratory makes it possible to study the human voice with an
objective and noninvasive method. In this work, we studied parameters of
the human voice, namely pitch, formants, jitter, shimmer and
harmonic-to-noise ratio, in a group of young people. These statistics
were obtained from Argentinian speakers.


I. Introduction 

The voice is a multidimensional phenomenon that
must be evaluated using special tools for determining
acoustic parameters. These parameters are the
pitch or voice tone; the timbre, regarded as the
personality of the voice and particular to each
person (determined by the fundamental frequency, its
harmonics and the formants); and the degree of
hoarseness.

During sustained vibration, the vocal folds
exhibit variations of fundamental frequency and
amplitude; these phenomena are called “frequency
perturbation” (jitter) and “amplitude perturbation”
(shimmer). They reflect fluctuations in tension
and biochemical characteristics of the vocal
folds, as well as variation in their neural control
and the physiological properties of individual
voices.

Acoustic analysis is one of the major advances
in the study of voice, increasing the accuracy
of diagnosis in this area. Normal values as standards
are important and necessary to guide voice
professionals.

Not many studies have been performed for the
Latin languages. However, there are several
for the English language.

In the same way, the software used for voice therapy
is in general designed for languages other than
Spanish. A comparison has been made, though, between
the vowel systems of English and Spanish
(the variety spoken in Madrid, Spain), which
have relatively large and small vowel inventories,
respectively [9]. For this reason, we consider it
important and necessary to produce more
results for the Spanish-speaking population.

We analyzed 72 audio files of female and male
voices from an Argentinian Spanish-speaking population
to obtain the acoustical parameters using
the Praat program [10]. Our data were compared
to Bradlow [9], Hualde [11] and Casado Morente et
al. [12]. The pitches measured were lower than expected,
and the first formant of the /a/ and /u/
vowels is higher than the published data. Additionally,
the Harmonic to Noise Ratio (HNR) values
discriminated per vowel are presented.

Figure 1: Wave shape of the /a/ sound.

Figure 2: Wave shape of the /i/ sound.

Figure 3: Harmonics of the /a/ vowel.

Figure 4: Harmonics of the /i/ vowel.

II. Measurement methodology 

Pitch, First and Second formants, Jitter, Shimmer
and Harmonic to Noise Ratio (HNR) are the cornerstones
of acoustic measurement of voice signals,
and are often regarded as indices of the perceived
quality of both normal and pathological voices [13].

In this work, we analyzed the audio files of the
five Spanish vowels produced by 72 female and male
individuals, in order to study the parameters previously
mentioned. The individuals are Argentinian
university students aged between 20
and 30, coming from different regions without any
particular geographical distribution.

The voices were recorded using a Behringer C-1U
(USB) cardioid microphone and a notebook.

The microphone was placed at a distance of 10
cm from the mouth of the subjects while
they pronounced the vowels at a comfortable
intensity and tone in an acoustically
treated room. Each sound was sustained for
at least five seconds.

The Praat program, commonly used in linguistics
for the scientific analysis of the human voice [10],
was used to record and analyze the wav files and to obtain
all the parameters presented in this work. A sample
rate of 44100 Hz was used to record the sound files.

The wave shapes of the sounds corresponding to the
/a/ and /i/ vowels are shown in Figs. 1 and 2. In
Figs. 3 and 4, the harmonic components obtained
by applying the Fourier transform to the respective
vowel signal are shown.
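
As a rough illustration of how harmonic components like those in Figs. 3 and 4 can be obtained, the following Python sketch applies a discrete Fourier transform to a short synthetic signal. The test tone, its 200 Hz fundamental, the harmonic amplitudes and the function names are illustrative assumptions, not the recordings or the Praat procedure used in this work; only the 44100 Hz sample rate comes from the text.

```python
import cmath
import math

SAMPLE_RATE = 44100  # Hz, the rate used for the recordings


def dft_magnitudes(samples):
    """Magnitude of the discrete Fourier transform for bins 0..N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def strongest_frequencies(samples, sample_rate, n_peaks=3):
    """Frequencies (Hz) of the n_peaks largest spectral bins, ascending."""
    mags = dft_magnitudes(samples)
    bins = sorted(range(len(mags)), key=mags.__getitem__)[-n_peaks:]
    return sorted(b * sample_rate / len(samples) for b in bins)


# Hypothetical 10 ms test tone: 200 Hz fundamental plus two weaker harmonics.
n_samples = 441
tone = [math.sin(2 * math.pi * 200 * i / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * 400 * i / SAMPLE_RATE)
        + 0.25 * math.sin(2 * math.pi * 600 * i / SAMPLE_RATE)
        for i in range(n_samples)]

peaks = strongest_frequencies(tone, SAMPLE_RATE)
```

The direct sum keeps the sketch free of dependencies; for a full five-second recording an FFT routine such as `numpy.fft.rfft` would be the practical choice.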

Pitch 

The pitch is a perceptual attribute of sound
closely related to frequency; this perception is
a subjective notion.

In psychoacoustics, the pitch is related to the
fundamental frequency of vibration of the vocal
cords, allowing the perception of the tone
frequency.

Nevertheless, for the Praat program [10], the pitch
coincides with the fundamental harmonic of the
wave, and we used this definition in this work.

This parameter depends on gender, being higher
for women and lower for men.
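
To make the working definition concrete, the sketch below estimates the fundamental frequency of a periodic signal by searching for the lag that maximizes the normalized autocorrelation. It is a simplified stand-in and not Praat's pitch algorithm; the function name `estimate_f0`, the 75 to 500 Hz search range, the 8000 Hz rate and the synthetic segment are all assumptions chosen for illustration.

```python
import math


def estimate_f0(samples, sample_rate, f_min=75.0, f_max=500.0):
    """Estimate the fundamental frequency (Hz) by normalized autocorrelation."""
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_score = lag_min, -1.0
    for lag in range(lag_min, lag_max + 1):
        head = samples[:-lag]   # the signal
        tail = samples[lag:]    # the same signal shifted by `lag` samples
        num = sum(a * b for a, b in zip(head, tail))
        den = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
        score = num / den if den > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Hypothetical voiced segment: 125 Hz fundamental with one harmonic,
# sampled at 8000 Hz for 50 ms (values chosen only for illustration).
RATE = 8000
segment = [math.sin(2 * math.pi * 125 * i / RATE)
           + 0.5 * math.sin(2 * math.pi * 250 * i / RATE)
           for i in range(400)]

f0 = estimate_f0(segment, RATE)
```

Because the lag is an integer number of samples, the resolution of such an estimate is limited by the sample rate; production pitch trackers interpolate around the best lag.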

Formants

The voice is created in the vocal cords, shaped as a
complex sound with harmonics, and modified in the
vocal tract by its resonant frequencies. The
amplitudes of the harmonic frequencies are thus enveloped,
forming an energy spectrum; the peaks or maxima
observed in this spectrum are named “formants.”
Consequently, a formant is a concentration
of acoustic energy around a particular frequency in
the speech wave. There are several formants, each
at a different frequency corresponding to a resonance
of the vocal tract; the first
two in particular are related to the movement of the tongue.
The magnitude of the First formant (F1) is
inversely related to the height of the tongue,
and the Second formant (F2) is related to how far
forward the tongue is placed.

Jitter and Shimmer 

The naturalness of sustained vowels is attributed
to the fundamental frequency and the signal
amplitude. Still, voice production shows unwanted
variations over time in these properties of the
sound signal.

While jitter indicates the variability or perturbation
of the fundamental frequency, shimmer refers to
the same perturbation but, in this case, of the
amplitude of the sound wave, or the intensity of vocal
emission. Jitter is affected mainly by lack of control
of vocal fold vibration, and shimmer by reduction
of glottic resistance and mass lesions in the vocal
folds, which are related to the presence of noise at
emission and breathiness.
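
A minimal sketch of the "local" form of both measures follows, using the common definition: the mean absolute difference between consecutive cycles divided by the mean over all cycles, expressed in percent. The function name and the cycle values are hypothetical and are not data from this study; extracting the cycle durations and peak amplitudes from a recording is assumed to have been done beforehand.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by
    the mean value, in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical glottal-cycle measurements (not data from this study).
periods_ms = [4.40, 4.42, 4.39, 4.41, 4.40]       # cycle durations
peak_amplitudes = [0.50, 0.52, 0.49, 0.51, 0.50]  # cycle peak amplitudes

jitter_local = local_perturbation(periods_ms)        # frequency perturbation
shimmer_local = local_perturbation(peak_amplitudes)  # amplitude perturbation
```

The same function serves both measures because jitter and shimmer differ only in the quantity that is tracked from cycle to cycle.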

Harmonic to Noise Ratio - HNR 

The amount of energy conveyed in the fundamental
frequency (f0) and its harmonics, divided
by the energy in noise frequencies, is defined as
the harmonic-to-noise ratio. Frequencies that are
not integer multiples of f0 are regarded as noise.
This parameter is related to the perception of vocal
roughness and hoarseness.

Normal voices have a low level of noise and a high
HNR. Conversely, hoarseness
increases the noise component and decreases the HNR.
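
The definition can be turned into a small spectral computation: sum the energy in the bins at integer multiples of f0, sum the energy in all other bins, and express the ratio in dB. The sketch below does this for a synthetic signal and is not the procedure Praat uses; the function names, the 200 Hz fundamental and the single 300 Hz inharmonic tone standing in for noise are assumptions made for illustration.

```python
import cmath
import math


def power_spectrum(samples):
    """Squared DFT magnitudes for bins 1..N/2 (the DC bin is skipped)."""
    n = len(samples)
    return {k: abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                       for i, x in enumerate(samples))) ** 2
            for k in range(1, n // 2 + 1)}


def hnr_db(samples, sample_rate, f0):
    """HNR in dB: energy at integer multiples of f0 over all other energy.

    Assumes f0 falls exactly on a DFT bin, which real speech rarely does.
    """
    bin_width = sample_rate / len(samples)
    f0_bin = round(f0 / bin_width)
    power = power_spectrum(samples)
    harmonic = sum(p for k, p in power.items() if k % f0_bin == 0)
    noise = sum(p for k, p in power.items() if k % f0_bin != 0)
    return 10.0 * math.log10(harmonic / noise)


# Hypothetical 10 ms signal: harmonics at 200 and 400 Hz, plus a weak
# 300 Hz component that plays the role of noise.
RATE = 44100
signal = [math.sin(2 * math.pi * 200 * i / RATE)
          + 0.5 * math.sin(2 * math.pi * 400 * i / RATE)
          + 0.1 * math.sin(2 * math.pi * 300 * i / RATE)
          for i in range(441)]

hnr = hnr_db(signal, RATE, f0=200.0)
```

Here the harmonic energy is 125 times the noise energy, so the result is 10 log10(125), about 21 dB.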

III. Results and Discussion 

The measured data were processed statistically and
the results are shown in Tables 1 to 6 and
Figs. 5 and 6.

Figure 5: Female formant chart.

Figure 6: Male formant chart.

The pitches for female and male individuals are
shown in Table 1. We used the minimum and maximum
values to describe the dispersion instead of
the standard deviation because the data distribution
was not normal. Our values are in general
lower for both genders compared to the published
data [9, 11, 12].
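
The choice of summary can be sketched in a few lines of Python: for data that are not normally distributed, the extremes and a central value describe the sample without relying on a standard deviation. The pitch values below are made up for illustration and are not the study's data, and treating the central value as a median is an assumption of this sketch.

```python
import statistics


def summarize(values):
    """Minimum, median and maximum: a summary that does not assume normality."""
    return {"min": min(values),
            "median": statistics.median(values),
            "max": max(values)}


# Made-up pitch values in Hz, for illustration only (not the study's data).
pitches_hz = [98, 104, 110, 121, 128, 133, 150, 171, 190]

summary = summarize(pitches_hz)
```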

Tables 2 and 3 show the First and Second formant
values, and Figs. 5 and 6 show the formant
charts for the female and male populations
obtained in this work.

We have compared our male results with formant
data of male Spanish speakers published by Bradlow [9].

In general, the First (F1) and Second (F2) formant
values are comparable to the published ones.
In particular, the F1 formants for the /a/ and
/u/ vowels are higher than the reported ones, by 12 %
and 21 %, respectively.

           Female   Male
Maximum    314      196
Medium     225      128
Minimum    155      85

Table 1: Pitch values of female and male subjects
in Hz.

The Second formant, F2, for the /o/ vowel is
lower than Bradlow's value by 12 %.

On the other hand, we cannot compare our female
formant values with published results because
we could not find results for female individuals in
the literature. Comparing female versus male F1
formants, we observed that most of them are higher
by 20 %, but in the case of the /o/ vowel the difference
is 11 %.

Comparing F2 formants, the female values are
higher than the male ones, by almost 25 %
for the /a/ and /i/ vowels.

Furthermore, the F2 of the /u/ vowel in our samples
shows a large scatter for both genders.

Tables 4 and 5 show the Jitter and
Shimmer values obtained for each vowel. They
are comparable to the Jitter and Shimmer averages
obtained by Casado Morente et al. [12] in a
study of a group of normal subjects. In
our work, we observed that the Jitter and
Shimmer values of the /a/ vowel are larger than
the corresponding ones of the other vowels.

Finally, the HNR results (see Table 6) agree
with the average value presented by Casado
Morente et al. [12]. However, we could not find
in the literature the HNR values for each of the
five Spanish vowels, so we had to make the comparison
with their average. In the present work,
we found that the vowels show an increasing
HNR value from /a/ to /u/, meaning that /u/ has
a better signal-to-noise ratio than the other vowels.

IV. Concluding remarks 

The objective of this research was to measure
acoustical properties of the Spanish voices of
Argentinian speakers.


Vowels   F1 [Hz]     F2 [Hz]
/i/      370 ± 45    2600 ± 110
/e/      525 ± 40    2300 ± 130
/a/      900 ± 55    1500 ± 100
/o/      550 ± 40    1000 ± 80
/u/      440 ± 40    1150 ± 430

Table 2: First and Second formants of female subjects.



Vowels   F1 [Hz]     F2 [Hz]
/i/      300 ± 25    2220 ± 100
/e/      450 ± 35    1935 ± 90
/a/      715 ± 55    1260 ± 60
/o/      490 ± 35    900 ± 45
/u/      390 ± 45    970 ± 430

Table 3: First and Second formants of male subjects.

Vowels   Shimmer Local [%]   Jitter Local [%]
/a/      2.7 ± 1.1           0.31 ± 0.10
/e/      2.1 ± 0.7           0.28 ± 0.08
/i/      2.2 ± 0.6           0.29 ± 0.07
/o/      2.0 ± 0.7           0.26 ± 0.11
/u/      2.1 ± 0.7           0.27 ± 0.09

Table 4: Shimmer and Jitter of female subjects.


Vowels   Shimmer Local [%]   Jitter Local [%]
/a/      3.0 ± 0.9           0.36 ± 0.10
/e/      2.3 ± 0.8           0.33 ± 0.09
/i/      2.3 ± 0.7           0.28 ± 0.08
/o/      2.2 ± 0.8           0.29 ± 0.10
/u/      2.3 ± 0.9           0.25 ± 0.07

Table 5: Shimmer and Jitter of male subjects.


Vowels   Female    Male
/a/      21 ± 3    20 ± 2
/e/      20 ± 2    21 ± 2
/i/      22 ± 3    22 ± 2
/o/      25 ± 3    24 ± 3
/u/      25 ± 4    25 ± 3

Table 6: Harmonic to Noise Ratio of female and
male subjects in dB.

These voice parameters are generally assessed
subjectively by several authors. This form of perceptual
analysis of voice has significant limitations,
and the subtle interpretative judgments of verbal
classifications may not be accurate.

The differences we found between the vowel parameters
measured in a group of people from Argentina
and those obtained from
Spanish-speaking people living in Spain suggest
that the region of study has an important influence on
the results, as expected.

Studies of this kind are very useful for comparing
the properties of normal and pathological voices of
people from different regions.

It is necessary to test the same parameters in 
female Spanish speakers as well. 

Such work should be performed on larger samples
and should be extended to other countries or
regions of Latin America, especially where different
ethnic groups can be found.